SOAP: Efficient Feature Selection of Numeric Attributes

Authors

  • Roberto Ruiz Sánchez
  • Jesús S. Aguilar-Ruiz
  • José Cristóbal Riquelme Santos
Abstract

Attribute selection techniques for supervised learning, used in the preprocessing phase to emphasize the most relevant attributes, make classification models simpler and easier to understand. Depending on the method applied (its starting point, search organization, evaluation strategy, and stopping criterion), attribute selection adds a cost to the classification algorithm that is normally compensated, to a greater or lesser extent, by the reduction of attributes in the classification model. The algorithm SOAP (Selection of Attributes by Projection) has some interesting characteristics: a lower computational cost than other typical algorithms, O(mn log n) for m attributes and n examples in the data set, owing to the absence of distance and statistical calculations; and no need for transformation. The performance of SOAP is analysed in two ways: percentage of reduction and classification. SOAP is compared with CFS [6] and ReliefF [11]. The results are generated by C4.5 and 1NN before and after the application of the algorithms.
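The abstract gives the cost, O(mn log n), but not the ranking criterion itself. The following is a minimal sketch of one way a projection-based ranking could reach that cost, assuming each attribute is scored by the number of class-label changes seen when the examples are sorted by that attribute alone. The function names `label_changes` and `rank_attributes` are hypothetical, and the handling of tied attribute values is deliberately simplified (ties keep their input order), so this should not be read as the authors' implementation.

```python
from typing import List, Sequence, Tuple


def label_changes(values: Sequence[float], labels: Sequence[str]) -> int:
    """Count class-label changes along one attribute's sorted projection.

    Assumption: fewer changes means the attribute separates classes better.
    Tied values keep their input order (Python's sort is stable).
    """
    # One sort per attribute dominates the cost: O(n log n).
    order = sorted(range(len(values)), key=lambda i: values[i])
    # A single linear pass compares each example with its successor.
    return sum(1 for a, b in zip(order, order[1:]) if labels[a] != labels[b])


def rank_attributes(
    rows: Sequence[Sequence[float]], labels: Sequence[str]
) -> List[Tuple[int, int]]:
    """Return (attribute index, change count) pairs, fewest changes first.

    With m attributes and n examples this does m sorts of n items,
    i.e. O(mn log n) overall, with no distance or statistical computation.
    """
    n_attrs = len(rows[0])
    scores = [
        (j, label_changes([row[j] for row in rows], labels))
        for j in range(n_attrs)
    ]
    return sorted(scores, key=lambda pair: pair[1])
```

Under this assumed criterion, a selection step would keep the top-ranked attributes and pass only those to the classifier (C4.5 or 1NN in the experiments described above); how many to keep is left open here.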


Similar resources

Chi2: feature selection and discretization of numeric attributes

Discretization can turn numeric attributes into discrete ones. Feature selection can eliminate some irrelevant attributes. This paper describes Chi2, a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data, and achieves feature selection via discretization. The empirical results demonstrate that Chi2 i...


Projection-based measure for efficient feature selection

Attribute selection techniques for supervised learning, used in the preprocessing phase to emphasize the most relevant attributes, make classification models simpler and easier to understand. Depending on the method applied (its starting point, search organization, evaluation strategy, and stopping criterion), there is an added cost to the classification algorithm that we are going...


Feature Selection via Discretization

Discretization can turn numeric attributes into discrete ones. Feature selection can eliminate some irrelevant and/or redundant attributes. Chi2 is a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data. It achieves feature selection via discretization. It can handle mixed attributes, work with mul...


A Novel Feature Selection Algorithm for Strongly Correlated Attributes Using Two-Dimensional Discriminant Rules

Considerable attention has been devoted to the development of feature selection algorithms for various applications in the last decade. Most of them concentrate on single attributes. In contrast, limited research has been devoted to determining correlated and pairwise attributes or features, due to the difficulty of the problem. We present a novel feature selection algorithm for strongly ...


Fast Feature Selection by Means of Projections

Attribute selection techniques for supervised learning, used in the preprocessing phase to emphasize the most relevant attributes, make classification models simpler and easier to understand. The algorithm SOAP (Selection of Attributes by Projection) has some interesting characteristics: lower computational cost (O(mn log n) for m attributes and n examples in the data set) with respe...




Publication date: 2002